Dataset Description
A total of 7880 individuals from 2611 families were genotyped on Illumina Human 1Mv1.
- 4901 males, 2979 females.
- 2571 trios, 36 quads, 1 penta, 3 hexes.
- 947,233 SNPs were genotyped.
- Coordinates were based on Build36.
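As a quick consistency check, the family-structure counts above can be tallied against the reported totals. The snippet below is only a sketch: the family sizes (3 for a trio, 4 for a quad, 5 and 6 for the two larger family types) are assumptions, since the notes do not spell them out.

```python
# Assumed family sizes: trio = 3, quad = 4, five-member = 5, six-member = 6.
# The counts per family type are taken from the list above.
family_counts = {3: 2571, 4: 36, 5: 1, 6: 3}

n_families = sum(family_counts.values())
n_individuals = sum(size * n for size, n in family_counts.items())

# Sex breakdown reported above.
n_by_sex = 4901 + 2979

print(n_families, n_individuals, n_by_sex)  # 2611 7880 7880
```

Under these assumptions the counts agree with the 2611 families and 7880 individuals stated in the description.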
Raw Genotype QC
Sex Check
- 141 individuals flagged as PROBLEM
- 115 with completely missing chrX genotypes.
- 26 with chrX F ranging from 0.20 to 0.62.
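The breakdown above can be reproduced by splitting the flagged samples according to whether a chrX F estimate exists. The following is a minimal sketch, assuming the sex check was run with PLINK and produced a table in the usual .sexcheck layout (FID, IID, PEDSEX, SNPSEX, STATUS, F), and assuming that samples without chrX genotypes carry `nan` in the F column. The rows are made up for illustration and are not the AGP data; `split_sex_problems` is a hypothetical helper name.

```python
import math

# Made-up rows in the layout of a PLINK .sexcheck file (not the real data).
SEXCHECK_TEXT = """\
FID IID PEDSEX SNPSEX STATUS F
1 101 1 1 OK 0.98
1 102 2 2 OK 0.03
2 201 1 0 PROBLEM nan
3 301 2 0 PROBLEM 0.41
4 401 1 0 PROBLEM 0.25
"""


def split_sex_problems(text):
    """Split PROBLEM rows into (no chrX F estimate, chrX F estimate present)."""
    lines = text.strip().splitlines()
    header = lines[0].split()
    no_f, with_f = [], []
    for line in lines[1:]:
        row = dict(zip(header, line.split()))
        if row["STATUS"] != "PROBLEM":
            continue
        f = float(row["F"])  # float("nan") parses to a NaN value
        target = no_f if math.isnan(f) else with_f
        target.append((row["FID"], row["IID"], f))
    return no_f, with_f


no_f, with_f = split_sex_problems(SEXCHECK_TEXT)
print(len(no_f), "without chrX F;", len(with_f), "with chrX F")
```

On the real output, the second group would then be inspected for the range of F values, as reported in the list above.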
ChrX F distributions

Pairwise IBD estimation
- Relationships (RT): OT (Others), FS (Full Siblings), PO (Parent Offspring)
- Family ID 483 has a potential issue:
- inbreeding coefficient = 1 between IID:328 (Female) and IID:1491 (Female)
- Possibly MZ twins, or the same individual genotyped twice?
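Pairs like the one in family 483 can be picked out by scanning the pairwise IBD table for estimates close to 1. The sketch below assumes the estimates come from a PLINK .genome-style table and that the value of 1 reported above corresponds to its PI_HAT column, which these notes do not confirm. The rows, the 0.95 threshold, and the name `find_duplicate_like_pairs` are illustrative assumptions.

```python
# Made-up rows in the layout of a PLINK .genome file (subset of columns).
GENOME_TEXT = """\
FID1 IID1 FID2 IID2 RT Z0 Z1 Z2 PI_HAT
10 1 10 2 PO 0.00 1.00 0.00 0.50
10 3 10 4 FS 0.25 0.50 0.25 0.50
20 5 20 6 OT 0.00 0.00 1.00 1.00
30 7 31 8 OT 1.00 0.00 0.00 0.00
"""


def find_duplicate_like_pairs(text, threshold=0.95):
    """Return (FID1, IID1, FID2, IID2) for pairs with PI_HAT >= threshold."""
    lines = text.strip().splitlines()
    header = lines[0].split()
    flagged = []
    for line in lines[1:]:
        row = dict(zip(header, line.split()))
        if float(row["PI_HAT"]) >= threshold:
            flagged.append((row["FID1"], row["IID1"], row["FID2"], row["IID2"]))
    return flagged


pairs = find_duplicate_like_pairs(GENOME_TEXT)
print(pairs)
```

A flagged pair is consistent with either monozygotic twins or a duplicated sample; the table alone cannot distinguish the two.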
Estimated pairwise IBD distributions

Individual genome-wide heterozygosity
Genome-wide heterozygosity VS missing rates

Genome-wide F VS missing rates

Imputation
Pre-imputation
The imputation pipeline follows the one used for the SSC dataset. A total of 7769 individuals, ~784K autosomal SNPs, and ~22K chrX SNPs were used for further imputation.
- filters: --geno 0.05 --mind 0.2 --maf 0.01 --hwe 1e-6
- 111 people removed due to missing genotype data (--mind).
- Total genotyping rate in remaining samples is 0.914029.
- 124565 variants removed due to missing genotype data (--geno).
- 15633 variants removed due to Hardy-Weinberg exact test.
Note that a liberal threshold of 0.2 was used for the individual genotype missing rate (--mind) for the AGP data here, since a large number of individuals have imiss > 0.1. The 111 removed individuals have imiss ranging from 0.7 to 1.
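For reference, the filter settings listed above could be assembled into a single PLINK call as sketched below. The four thresholds are taken from these notes; whether they were applied in one invocation is not stated, and the `--bfile`, `--make-bed`, and `--out` arguments as well as the file prefixes `agp_raw` and `agp_preimpute` are hypothetical. The snippet only builds and prints the command; it does not run PLINK.

```python
def build_plink_qc_command(bfile, out):
    """Build a PLINK argument list with the pre-imputation filters."""
    return [
        "plink",
        "--bfile", bfile,   # hypothetical input prefix (.bed/.bim/.fam)
        "--geno", "0.05",   # drop variants with > 5% missing genotypes
        "--mind", "0.2",    # drop individuals with > 20% missing genotypes
        "--maf", "0.01",    # drop variants with minor allele frequency < 1%
        "--hwe", "1e-6",    # drop variants failing the HWE exact test
        "--make-bed",       # write the filtered binary fileset
        "--out", out,       # hypothetical output prefix
    ]


cmd = build_plink_qc_command("agp_raw", "agp_preimpute")
print(" ".join(cmd))

# Sample bookkeeping from the counts reported above.
remaining = 7880 - 111
print(remaining)  # 7769
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` if PLINK is installed and on the PATH.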